Finding the Best Approach for Multi-lingual Text Summarisation: A Comparative Analysis

نویسندگان

Elena Lloret

Manuel Palomar

چکیده

This paper addresses the problem of multilingual text summarisation. The goal is to analyse three approaches for generating summaries in four languages (English, Spanish, German and French), in order to determine the best one to adopt when tackling this issue. The proposed approaches rely on: i) language-independent techniques; ii) language-specific resources; and iii) machine translation resources applied to a mono-lingual summariser. The evaluation carried out employing the JRC corpus – a corpus specifically created for multi-lingual summarisation – shows that the approach which uses languagespecific resources is the most appropriate in our comparison framework, performing better than state-of-the-art multi-lingual summarisers. Moreover, the readability assessment conducted over the resulting summaries for this approach proves that they are also very competitive with respect to their quality.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

English-Persian Plagiarism Detection based on a Semantic Approach

Plagiarism which is defined as “the wrongful appropriation of other writers’ or authors’ works and ideas without citing or informing them” poses a major challenge to knowledge spread publication. Plagiarism has been placed in four categories of direct, paraphrasing (rewriting), translation, and combinatory. This paper addresses translational plagiarism which is sometimes referred to as cross-li...

متن کامل

Qualitative and Quantitative Examination of Text Type Readabilities: A Comparative Analysis

This study compared 2 main approaches to readability assessment. Thequantitative approach applied idea density based on part of speech tagging andcompared 3 sets of text types (i.e., narrative, expository, and argumentative) withrespect to their ease of reading. The qualitative approach was done throughdeveloping questionnaires measuring intermediate EFL learners’ perceptions oncontent, motivat...

متن کامل

Development of a Corpus for Evidence Based Medicine Summarisation

In this paper we introduce some of the key NLP-related problems related to the practice of Evidence Based Medicine and propose the task of multi-document query-focused summarisation as a key approach to solve these problems. We have completed a corpus for the development of such multi-document queryfocused summarisation task. The process to build the corpus combined the use of automated extract...

متن کامل

Comparative Study of Verse Similarity for Multi-lingual Representations of the Qur’an

Text similarity is a subject that has received great attention in recent years. However, the application of text similarity tools to Semitic languages such as Arabic faces unique challenges. Moreover, the increasing number of texts being made available online, not only in native languages but also in translation, adds further challenge to identifying similar portions of texts across different d...

متن کامل

Automatic Annotation of Corpora for Text Summarisation: A Comparative Study

This paper presents two methods which automatically produce annotated corpora for text summarisation on the basis of human produced abstracts. Both methods identify a set of sentences from the document which conveys the information in the human produced abstract best. The first method relies on a greedy algorithm, whilst the second one uses a genetic algorithm. The methods allow to specify the ...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2011

Finding the Best Approach for Multi-lingual Text Summarisation: A Comparative Analysis

نویسندگان

چکیده

منابع مشابه

English-Persian Plagiarism Detection based on a Semantic Approach

Qualitative and Quantitative Examination of Text Type Readabilities: A Comparative Analysis

Development of a Corpus for Evidence Based Medicine Summarisation

Comparative Study of Verse Similarity for Multi-lingual Representations of the Qur’an

Automatic Annotation of Corpora for Text Summarisation: A Comparative Study

عنوان ژورنال:

اشتراک گذاری